A calculational framework is proposed for phylogenetics, using nonlocal quantum field
theories in hypercubic geometry. Quadratic terms in the Hamiltonian give the underlying
Markov dynamics, while higher degree terms represent branching events. The spatial
dimension $L$ is the number of leaves of the evolutionary tree under consideration.
Momentum conservation modulo $\mathbb{Z}_2^L$ in $L \leftarrow 1$ scattering corresponds to
tree edge labelling using binary $L$-vectors. The bilocal quadratic term allows for
momentum-dependent rate constants - only the tree or trees compatible with selected nonzero
edge rates contribute to the branching probability distribution. Applications to models of
evolutionary branching processes are discussed.



Evolutionary processes are frequently represented as discrete or continuous time
stationary Markov dynamics on some relevant set of system characters. Divergence events
correspond to the initiation of two or more sibling processes, which each inherit the
character probability distribution of the progenitor and then continue to evolve. It is the
task of phylogenetic inference to deduce ancestral interrelationships given observed
character probability distributions.

Although the individual ingredients for modelling such branching trees are quite well
understood (see for example [?], [?]), to date there is no overall dynamical picture for
phylogenetics. In this note we point out that existing tools from physics - namely, quantum
field theory and quantum many body theory when suitably interpreted in a stochastic
context [?], [?], [?] - can provide both a theoretical perspective and a calculational
framework. Below we sketch a general outline of our proposed model; details will be
published in a separate paper.

Consider a theory with Hamiltonian of the general form
$\mathcal{H}(t) = \mathcal{H}_0 + \mathcal{H}_1(t)$, with

[...]    (1)

for quantised fields $\Psi_\alpha(x)$ of type $\alpha = 1, \ldots, K$. The sum is taken over
vertices of a unit hypercube, $x, y \in \mathbb{Z}_2^L$, and the theory is manifestly
translation invariant under $x \to x + a$, for $a \in \mathbb{Z}_2^L$. The interaction times
$t_j$ are temporally ordered as $0 = t_0 < t_1 < t_2 < \ldots < t_M < t_{M+1} = T$, where $T$
is the total time for the evolution. As will be seen below, cubic interaction terms generate
branching events, with the additional quadratic terms necessary to ensure that the theory is
overall probability conserving [?].

Quantisation is imposed in such a way that the time evolution generated by the
quadratic Hamiltonian $\mathcal{H}_0$ reproduces the standard Markov dynamics on each mode of
the field. Consider the following expansions in momentum space $\mathbb{Z}_2^L$:

$$M_\alpha{}^\beta(x-y) = \lambda(x-y)\,M_\alpha{}^\beta, \qquad \lambda(x) = \sum_k \lambda(k)\,e^{i\pi k\cdot x}, \qquad \Psi_\alpha(x) = \sum_k e^{i\pi k\cdot x}\,c_\alpha(k). \tag{2}$$
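Because $k\cdot x$ is an integer, the phase $e^{i\pi k\cdot x}$ in (2) is just the sign $(-1)^{k\cdot x}$, so the expansion of $\lambda$ is a Walsh-Hadamard transform on the hypercube. The following is a minimal sketch of that transform pair, assuming binary $L$-vectors are stored as integer bitmasks; the function names and the rate values in the usage note are illustrative and not taken from the paper.

```python
def dot_mod2(k, x):
    """Parity of k.x for two integers read as binary L-vectors."""
    return bin(k & x).count("1") % 2


def to_position_space(lam_k, L):
    """lambda(x) = sum_k lambda(k) * (-1)^(k.x), using exp(i*pi*n) = (-1)^n."""
    size = 1 << L
    return [
        sum(lam_k[k] * (-1) ** dot_mod2(k, x) for k in range(size))
        for x in range(size)
    ]


def to_momentum_space(lam_x, L):
    """Inverse transform: the same signed sum, divided by 2^L."""
    size = 1 << L
    return [value / size for value in to_position_space(lam_x, L)]
```

For instance, with $L = 3$ one might set hypothetical rates on the momenta 1, 2, 4, 6 and 7 only, `lam_k = [0.0, 0.3, 0.2, 0.0, 0.5, 0.0, 0.1, 0.4]`; `to_momentum_space(to_position_space(lam_k, 3), 3)` then returns the same list up to rounding.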

Basis states of the system are Fock states of the form

$$|\alpha_1 k_1\,\alpha_2 k_2 \ldots \alpha_N k_N\rangle = c^\dagger_{\alpha_1}(k_1)\,c^\dagger_{\alpha_2}(k_2) \cdots c^\dagger_{\alpha_N}(k_N)\,|0\rangle, \tag{3}$$

where the vacuum is defined as usual by the property of being annihilated by the modes
$c_\alpha(k)$. For the evolution of states $|P(t)\rangle$ under the time independent
Hamiltonian $\mathcal{H}_0$, the solution of Schrödinger's equation

$$\frac{d}{dt}\,|P(t)\rangle = -\mathcal{H}_0\,|P(t)\rangle \tag{4}$$

for evolution after time $T$, namely

$$|P(T)\rangle = e^{-\mathcal{H}_0 T}\,|P(0)\rangle, \tag{5}$$



must be computed with the help of the canonical commutation relations of the field. At
this stage it is only necessary to impose the trilinear condition [?]

$$\sum_k \left[\,c^\dagger_\alpha(k)\,c_\beta(k),\; c_\gamma(\ell)\,\right] = \delta_{\alpha\gamma}\,c_\beta(\ell). \tag{6}$$

Consider for example separable states such as

$$|p(k_1,t)\rangle \otimes |p(k_2,t)\rangle \otimes \cdots \otimes |p(k_N,t)\rangle \tag{7}$$

representing a number of processes evolving in parallel, with each $|p(k,t)\rangle$ a
single-particle state corresponding to a probability distribution for characters of an
individual process,

$$|p(k,t)\rangle = \sum_\alpha p_\alpha(k,t)\,|\alpha k\rangle. \tag{8}$$

With the quadratic Hamiltonian $\mathcal{H}_0$ and the trilinear condition (6), either
fermionic or bosonic quantisation leads to a time evolution of the separable states (7) in
which the probability distribution of each individual mode is given by the solution of the
appropriate classical master equation,

$$p_\alpha(k,T) = \left(e^{-\lambda(k)T\cdot M}\right)_\alpha{}^\beta\,p_\beta(k,0) = U(k)_\alpha{}^\beta\,p_\beta(k,0). \tag{9}$$
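The single-mode evolution (9) can be illustrated numerically. The sketch below assumes a sign convention in which $-M$ is a rate matrix (non-negative off-diagonal entries, columns summing to zero), so that the exponential conserves probability; that convention, the two-character example matrix and all function names are assumptions made for illustration. The matrix exponential is a truncated Taylor series, which is adequate only for small matrices of modest norm.

```python
def mat_mul(A, B):
    """Product of two small dense matrices stored as lists of rows."""
    return [
        [sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def mat_exp(A, terms=40):
    """Matrix exponential via a truncated Taylor series."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        # term now holds A^order / order!
        term = [[v / order for v in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def evolve(p0, lam, T, M):
    """Return p(T) = exp(-lam * T * M) p(0) for a single mode."""
    U = mat_exp([[-lam * T * entry for entry in row] for row in M])
    return [sum(U[a][b] * p0[b] for b in range(len(p0))) for a in range(len(p0))]


# Assumed symmetric two-character model: -M is a rate matrix.
M_example = [[1.0, -1.0], [-1.0, 1.0]]
p_T = evolve([1.0, 0.0], lam=0.5, T=1.0, M=M_example)
```

For this example matrix the result agrees with the closed form $p_0(T) = (1 + e^{-2\lambda T})/2$, and the components of `p_T` sum to one.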

Turning to the full, time-dependent Hamiltonian
$\mathcal{H}(t) = \mathcal{H}_0 + \mathcal{H}_1(t)$, (5) must be replaced by the time ordered
exponential

$$|P(T)\rangle = \mathcal{T} e^{-\int_0^T dt\,\mathcal{H}(t)}\,|P(0)\rangle, \tag{10}$$

which in turn is expressible in the usual way as sums of multiple integrals of time-ordered
products $\cdots \mathcal{H}(t')\mathcal{H}(t'') \cdots$. Consider in particular the
$L \leftarrow 1$ process, and its evolution kernel representing the corresponding probability
distribution of characters. Choose the distinct outgoing momenta in some ordering to be the
simple binary vectors $(0, 0, \ldots, 1)$, $(0, \ldots, 1, 0)$, $\ldots$, and
$(1, 0, \ldots, 0)$ respectively. Since momentum conservation modulo $\mathbb{Z}_2^L$ must
hold by translation invariance, this fixes the incoming momentum to be the maximal value
$(1, 1, \ldots, 1)$. The probability distribution is then a sum over all terms generated by
the expansion of the time ordered exponential. Contributions from admissible tree diagrams
are enumerated by labelling edges with momenta $k$, with vertices for interaction times
$t_j$ having one incoming and two outgoing momenta $k$, $k'$, $k''$. Along edges, the
probability distribution $p_\alpha(k,t)$ evolves via (9) for the appropriate time intervals
$\Delta_{IJ} = (t_J - t_I)$ for $I < J$, so that the effective rate constant is
$\kappa(k) = \lambda(k)\Delta_{IJ}$. At vertices, momentum conservation ensures that a
particular character type splits with appropriate sharing of the probability and type
between the two subsequent edges (with momenta such that $k = k' + k''$). A plausible
description of the divergence event is
$V_{\alpha\beta}{}^{\gamma} = \delta_\alpha{}^\gamma\,\delta_\beta{}^\gamma$, which means that
the two sibling processes commence evolution on their respective edges with characters
distributed identically to that of their progenitor. Clearly, the model admits further
generalisation to nondiagonal or even trilocal or time-smeared interaction terms. Note that
the additional diagonal quadratic terms in $\mathcal{H}_1(t)$ are necessary to ensure that
the theory is overall probability conserving [?] but do not contribute to the tree diagrams
under consideration. The question of which tree or trees contribute to $L \leftarrow 1$
scattering is encoded in the bilocal form of $\mathcal{H}_0$ (see the kernel $\lambda(x-y)$
in (2)). Only momenta $k$ corresponding to nonzero rate constants $\lambda(k)$ are allowed.
For computation based on a given tree, it is thus possible to choose nonzero rate constants
$\lambda(k)$ for selected momenta corresponding to the binary edge labelling unique to that
tree's topology [?].
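The binary edge labelling described above can be sketched in a few lines. The example assumes a tree is given as nested tuples whose leaves are the single-bit integers 1, 2, 4, ..., and that each edge carries the sum modulo 2 (bitwise XOR) of the leaf momenta below it; the representation and the function names are illustrative choices rather than anything fixed by the model.

```python
def label_edges(tree, labels=None):
    """Return (momentum of the edge above `tree`, list of all edge momenta).

    Leaves are integers with a single bit set; an internal node is a tuple
    of subtrees. Momentum conservation k = k' + k'' (mod 2) is a bitwise XOR.
    """
    if labels is None:
        labels = []
    if isinstance(tree, int):
        k = tree
    else:
        k = 0
        for child in tree:
            k ^= label_edges(child, labels)[0]
    labels.append(k)
    return k, labels


def compatible(tree, allowed_momenta):
    """True if every edge momentum of `tree` has a nonzero rate constant."""
    _, labels = label_edges(tree)
    return set(labels) <= set(allowed_momenta)


# Three leaves: the three possible rooted binary topologies.
allowed = {1, 2, 4, 6, 7}
candidates = [((4, 2), 1), ((4, 1), 2), ((2, 1), 4)]
selected = [tree for tree in candidates if compatible(tree, allowed)]
```

With nonzero rates on the momenta 1, 2, 4, 6 and 7 only, `selected` contains the single topology `((4, 2), 1)`: the other two trees require an internal edge with momentum 5 or 3, for which the rate constant vanishes.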

As an illustration, consider the case $L = 3$, $M = 2$. Nonzero rate constants for the
model (1) are chosen for the root and leaf momenta $7 = (111)$, $1 = (001)$, $2 = (010)$ and
$4 = (100)$ respectively, together with a single additional momentum $6 = (110)$ (see figure
1). Write $\mathcal{H}_1(t) = V_1\,\delta(t-t_1) + V_2\,\delta(t-t_2)$. The time ordered
exponential in (10) may be written as a product,

$$\mathcal{T} e^{-\int_0^T dt\,\mathcal{H}(t)} = \mathcal{T} e^{-\int_{t_2+}^{T} dt\,\mathcal{H}(t)}\;\mathcal{V}_2\;\mathcal{T} e^{-\int_{t_1+}^{t_2-} dt\,\mathcal{H}(t)}\;\mathcal{V}_1\;\mathcal{T} e^{-\int_0^{t_1-} dt\,\mathcal{H}(t)}, \tag{11}$$

where $\mathcal{V}_j$ are time ordered exponentials for small intervals $\delta_j$ covering
$t_j$. These have the form $1 - \mathcal{H}_0\delta_j - V_j + \cdots$, the higher order terms
being ordered monomials in $\mathcal{H}_0$ and $V_j$ multiplied by nested $\delta$-function
integrals. In the limit $\delta_j \to 0$,

$$\mathcal{T} e^{-\int_0^T dt\,\mathcal{H}(t)} = e^{-\mathcal{H}_0 (T-t_2)}\,(1 - V_2 + \cdots)\,e^{-\mathcal{H}_0 (t_2-t_1)}\,(1 - V_1 + \cdots)\,e^{-\mathcal{H}_0 t_1}. \tag{12}$$

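Clearly the contribution to the $3 \leftarrow 1$ scattering probability associated with the
tree of figure 1 is, as required, the unique nonzero term arising from inserting intermediate
states in the above with the correct intermediate edge momenta, giving finally

$$\begin{aligned}
&\langle \alpha_1 1,\,\alpha_2 2,\,\alpha_3 4\,|\;e^{-\mathcal{H}_0 (T-t_2)}\,V_2\,e^{-\mathcal{H}_0 (t_2-t_1)}\,V_1\,e^{-\mathcal{H}_0 t_1}\,|P(0)\rangle = \\
&\qquad U(\kappa_2)_{\alpha_2}{}^{\beta_2}\,U(\kappa_4)_{\alpha_3}{}^{\beta_3}\,V_{\beta_2\beta_3}{}^{\gamma} \cdot U(\kappa_1)_{\alpha_1}{}^{\beta_1}\,U(\kappa_6)_{\gamma}{}^{\delta}\,V_{\beta_1\delta}{}^{\epsilon} \cdot U(\kappa_7)_{\epsilon}{}^{\zeta}\,p_\zeta(7,0).
\end{aligned}$$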
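As a numerical illustration of this tree contribution, the sketch below evaluates the joint distribution of leaf characters for the $L = 3$ tree, taking the diagonal divergence vertex (siblings start with the character of their progenitor) and effective rate constants $\kappa(k) = \lambda(k)\Delta_{IJ}$ on each edge. It assumes $K = 2$ character types and the symmetric two-state matrix $M$ with entries $1$ on the diagonal and $-1$ off it, for which $U(\kappa)$ has a closed form; these choices, the rate values and the function names are assumptions for illustration only.

```python
import math
from itertools import product


def U(kappa):
    """exp(-kappa * M) for the assumed symmetric two-state M = [[1, -1], [-1, 1]]."""
    e = math.exp(-2.0 * kappa)
    return [[(1 + e) / 2, (1 - e) / 2], [(1 - e) / 2, (1 + e) / 2]]


def leaf_distribution(lam, t1, t2, T, p_root):
    """Joint distribution P[(a1, a2, a3)] of characters on leaves 1, 2 and 4.

    Edges: 7 on [0, t1]; 6 on [t1, t2]; 1 on [t1, T]; 2 and 4 on [t2, T].
    """
    U7 = U(lam[7] * t1)
    U6 = U(lam[6] * (t2 - t1))
    U1 = U(lam[1] * (T - t1))
    U2 = U(lam[2] * (T - t2))
    U4 = U(lam[4] * (T - t2))
    states = range(len(p_root))
    # Distribution on the root edge just before the first branching at t1.
    q = [sum(U7[e][z] * p_root[z] for z in states) for e in states]
    P = {}
    for a1, a2, a3 in product(states, repeat=3):
        total = 0.0
        for e in states:      # character at the first divergence (7 -> 6 + 1)
            for g in states:  # character at the second divergence (6 -> 4 + 2)
                total += U1[a1][e] * U6[g][e] * q[e] * U2[a2][g] * U4[a3][g]
        P[(a1, a2, a3)] = total
    return P


# Hypothetical rate constants on the five edges of the tree in figure 1.
rates = {1: 0.3, 2: 0.2, 4: 0.5, 6: 0.1, 7: 0.4}
P = leaf_distribution(rates, t1=1.0, t2=2.0, T=3.0, p_root=[0.7, 0.3])
```

Because each $U(\kappa)$ is column-stochastic, the eight entries of `P` sum to one; setting all rates to zero reduces `P` to the root distribution copied onto all three leaves.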

In phylogenetics, the probability distributions or dispersion tensors of characters of
interest are given directly from observations. Whether these are compatible with
calculations for a specific tree remains a question of statistics. Our model (1) relates
phylogenetic inference for evolutionary processes to a scattering problem for the associated
quantum field theory. Recent work using Fourier-Hadamard inversion techniques for
phylogenetic reconstruction in molecular phylogenetics [7], [?] can be interpreted in our
model as working with position states rather than in the momentum representation.

The overall calculational framework provided by giving a definite dynamical model 
for the branching process has potentially wide applicability. The picture can be extended 
in practice by embellishment of various features. As already mentioned, these include 
for example vertex decorations. A further possibility is a perturbative expansion of the 
quadratic term to compute the modulation of systematic substitution frequency types by 
the effects of Poissonian background rates. Details of the model, and prospects for such 
extensions, will be published in a separate paper. 

Figure 1: Binary labelling scheme for a tree on 3 leaves ($L = 3$) with branching events at
intermediate times $t_1$, $t_2$. Nonzero rate constants for the model (1) are chosen for the
root and leaf momenta $7 = (111)$, $1 = (001)$, $2 = (010)$ and $4 = (100)$ respectively,
together with a single additional momentum $6 = (110)$.



